synthetic sample
Self-Consuming Generative Models with Curated Data Provably Optimize Human Preferences Damien Ferbach 1, 2, Quentin Bertrand 1, A vishek Joey Bose
The rapid progress in generative models has resulted in impressive leaps in generation quality, blurring the lines between synthetic and real data. Web-scale datasets are now prone to the inevitable contamination by synthetic data, directly impacting the training of future generated models. Already, some theoretical results on self-consuming generative models (a.k.a., iterative retraining) have emerged in the literature, showcasing that either model collapse or stability could be possible depending on the fraction of generated data used at each retraining step. However, in practice, synthetic data is often subject to human feedback and curated by users before being used and uploaded online. For instance, many interfaces of popular text-to-image generative models, such as Stable Diffusion or Midjourney, produce several variations of an image for a given query which can eventually be curated by the users. In this paper, we theoretically study the impact of data curation on iterated retraining of generative models and show that it can be seen as an implicit preference optimization mechanism .
- North America > Canada > Quebec > Montreal (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.67)
Improving the Training of Rectified Flows
One approach for tackling this problem is rectified flows, which iteratively learn smooth ODE paths that are less susceptible to truncation error. However, rectified flows still require a relatively large number of function evaluations (NFEs). In this work, we propose improved techniques for training rectified flows, allowing them to compete with knowledge distillation methods even in the low NFE setting.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > Canada (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.68)
Locally Interpretable Individualized Treatment Rules for Black-Box Decision Models
Charvadeh, Yasin Khadem, Panageas, Katherine S., Chen, Yuan
Existing methods typically rely on either interpretable but inflexible models or highly flexible black-box approaches that sacrifice interpretability; moreover, most impose a single global decision rule across patients. We introduce the Locally Interpretable Individualized Treatment Rule (LI-ITR) method, which combines flexible machine learning models to accurately learn complex treatment outcomes with locally interpretable approximations to construct subject-specific treatment rules. LI-ITR employs variational autoencoders to generate realistic local synthetic samples and learns individualized decision rules through a mixture of interpretable experts. Simulation studies show that LI-ITR accurately recovers true subject-specific local coefficients and optimal treatment strategies. An application to precision side-effect management in breast cancer illustrates the necessity of flexible predictive modeling and highlights the practical utility of LI-ITR in estimating optimal treatment rules while providing transparent, clinically interpretable explanations.
- Asia > Middle East > Jordan (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
- Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.04)
- Oceania > New Zealand (0.04)
- (11 more...)
- Energy > Power Industry (0.93)
- Health & Medicine (0.67)
- Energy > Renewable > Solar (0.47)
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > California (0.04)
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)